Improved Automatic Keyword Extraction Based on TextRank Using Domain Knowledge

Authors

  • Guangyi Li
  • Houfeng Wang
Abstract

Keyword extraction from scientific articles helps in retrieving articles on a given topic and in tracking trends in academic development. For keyword extraction from Chinese scientific articles, we adopt a framework that selects keyword candidates by Document Frequency Accessor Variety (DF-AV) and runs the TextRank algorithm on a phrase network. To improve the domain adaptation of keyword extraction, we introduce known keywords of a given domain into this framework as domain knowledge. Experimental results show that domain knowledge generally improves keyword extraction performance.
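
The abstract does not define DF-AV, so the following is only a minimal sketch of the plain accessor variety (AV) criterion that candidate selection of this kind builds on, not the authors' DF-AV variant. It assumes AV of a string is the smaller of its distinct left-neighbor count and distinct right-neighbor count, with each sentence boundary counted as its own distinct neighbor. The function name `accessor_variety` and the `max_len` parameter are hypothetical.

```python
from collections import defaultdict


def accessor_variety(sentences, max_len=4):
    """Return {substring: AV} for every substring of up to max_len characters.

    Assumption: AV(s) = min(distinct left neighbors, distinct right neighbors).
    This is plain accessor variety, not the DF-AV variant named in the abstract.
    """
    left = defaultdict(set)
    right = defaultdict(set)
    for sid, sent in enumerate(sentences):
        n = len(sent)
        for i in range(n):
            for j in range(i + 1, min(i + max_len, n) + 1):
                s = sent[i:j]
                # A sentence boundary counts as a distinct neighbor every time,
                # so it is tagged with the sentence id and position.
                left[s].add(sent[i - 1] if i > 0 else ("<s>", sid, i))
                right[s].add(sent[j] if j < n else ("</s>", sid, j))
    return {s: min(len(left[s]), len(right[s])) for s in left}


# Toy usage: "bc" appears in three different contexts, "b" is always followed by "c".
av = accessor_variety(["abcd", "xbcy", "bc"])
print(av["bc"], av["b"])
```

Strings that occur in many different contexts get a high AV and are more plausible as standalone phrases; a threshold on this score could then select the candidates that enter the phrase network.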

Similar resources

TextRank: Bringing Order Into Texts

In this paper, we introduce TextRank – a graph-based ranking model for text processing, and show how this model can be successfully used in natural language applications. In particular, we propose two innovative unsupervised methods for keyword and sentence extraction, and show that the results obtained compare favorably with previously published results on established benchmarks.
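
As a rough illustration of the graph-based ranking idea, the sketch below builds an undirected co-occurrence graph over a token list and iterates the unweighted TextRank score, S(w) = (1 - d) + d * sum of S(v) / degree(v) over the neighbors v of w. It assumes the tokens have already been filtered to candidate words; the function name `textrank_keywords` and its default parameters are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank tokens with an unweighted, undirected TextRank-style iteration."""
    # Link two distinct tokens when they co-occur within `window` positions.
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            if v != w:
                neighbors[w].add(v)
                neighbors[v].add(w)

    # Every node in `neighbors` has at least one edge, so the degree is never zero.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            for w in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy usage: "a" co-occurs with every other token, so it ranks first.
print(textrank_keywords("a b a c a d".split(), top_k=1))
```

A fixed iteration count is used here for brevity; stopping once the scores change by less than a small threshold would be the more usual choice.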

DegExt - A Language-Independent Graph-Based Keyphrase Extractor

In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in [6]. We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx [11] and TextRank [8]. Our experiments on a collection of benchmark summaries show that DegExt outperforms TextRank and GenEx in terms of precision and area un...

Automatic Generation of Personalized Annotation Tags for Twitter Users

This paper introduces a system designed for automatically generating personalized annotation tags to label Twitter users' interests and concerns. We applied TFIDF ranking and TextRank to extract keywords from Twitter messages to tag the user. The user tagging precision we obtained is comparable to the precision of keyword extraction from web pages for content-targeted advertising.

Automatic Summarization for Terminology Recommendation: The Case of the NCBO Ontology Recommender

The National Center for Biomedical Ontology (NCBO) ontology recommender helps users choose a biomedical terminology by analyzing a submitted document. Submitting a single document might not be representative and result in poor recommendations, while submitting a large sample might be expensive, sometimes unfeasible. In this paper, we investigate the effectiveness of two well-researched automati...

Automatic Keyword Extraction Using Domain Knowledge

Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By including a hierarchically organised domain-specific thesaurus as a second knowledge source, the quality of such keywords was improved considerably, as measured by match to previously manually assigned keywords.

Publication date: 2014